The effect of feature extraction and data sampling on credit card fraud detection

Abstract

Training a machine learning algorithm on a class-imbalanced dataset can be a difficult task, a process that could prove even more challenging under conditions of high dimensionality. Feature extraction and data sampling are among the most popular preprocessing techniques. Feature extraction is used to derive a richer set of reduced features, while data sampling is used to mitigate class imbalance. In this paper, we investigate these two techniques, using a credit card fraud dataset and four ensemble classifiers (Random Forest, CatBoost, LightGBM, and XGBoost). Within the context of feature extraction, the Principal Component Analysis (PCA) and Convolutional Autoencoder (CAE) methods are evaluated. With regard to data sampling, the Random Undersampling (RUS), Synthetic Minority Oversampling Technique (SMOTE), and SMOTE Tomek methods are evaluated. The F1 score and Area Under the Receiver Operating Characteristic Curve (AUC) metrics serve as measures of classification performance. Our results show that implementing the RUS method followed by the CAE method leads to the best performance for credit card fraud detection.
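As a rough illustration of two ideas named in the abstract, the sketch below shows random undersampling and the F1 score in plain Python. It is not the authors' implementation: the function names, the `ratio` and `seed` parameters, and the toy data are all assumptions made for this example, and the sketch assumes a binary labeling (0 = legitimate, 1 = fraud).

```python
import random
from collections import Counter


def random_undersample(X, y, ratio=1.0, seed=0):
    """Keep every minority-class row and a random subset of the majority class.

    `ratio` (hypothetical parameter) is the majority:minority count wanted
    after sampling; 1.0 yields a balanced set. Assumes binary labels.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]
    # Never ask for more majority rows than exist.
    n_keep = min(len(majority_idx), int(round(ratio * len(minority_idx))))
    kept = sorted(minority_idx + rng.sample(majority_idx, n_keep))
    return [X[i] for i in kept], [y[i] for i in kept]


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up): 990 legitimate and 10 fraudulent transactions.
X = [[float(i)] for i in range(1000)]
y = [1 if i % 100 == 0 else 0 for i in range(1000)]

X_bal, y_bal = random_undersample(X, y)
print(Counter(y_bal))  # class counts after undersampling
```

In a fuller pipeline, the undersampled training data would then pass through a feature extractor (PCA or an autoencoder) before a classifier is fitted; sampling should be applied to the training split only, so that the evaluation data keeps its original class distribution.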

Similar resources

Credit Card Fraud Detection using Data mining and Statistical Methods

Owing to today’s advances in technology and business, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, it becomes more difficult to detect fraudulent transactions manually. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

Feature engineering strategies for credit card fraud detection

Every year billions of Euros are lost worldwide due to credit card fraud, forcing financial institutions to continuously improve their fraud detection systems. In recent years, several studies have proposed the use of machine learning and data mining techniques to address this problem. However, most studies used some sort of misclassification measure to evaluate the different solutions, a...

Exploration of Data mining techniques in Fraud Detection: Credit Card

Data mining has been growing as one of the key features of many security initiatives. It is often used as a means of detecting fraud and assessing risk. Data mining involves the use of data analysis tools to discover unknown, valid patterns and relationships in large data sets. Recent decades have seen massive growth in the use of credit cards as a transactional medium. Data mini...

Fuzzy Darwinian Detection of Credit Card Fraud

Credit evaluation is one of the most important and difficult tasks for credit card companies, mortgage companies, banks and other financial institutions. Incorrect credit judgement causes huge financial losses. This work describes the use of an evolutionary-fuzzy system capable of classifying suspicious and non-suspicious credit card transactions. The paper starts with the details of the system u...

Journal

Journal title: Journal of Big Data

Year: 2023

ISSN: 2196-1115

DOI: https://doi.org/10.1186/s40537-023-00684-w